# Codebook: Harmonized WVS & ISSP Dataset with Macro Variables

**File:** `final_analysis_dataset_complete.parquet`  
**Location:** `data/final_analysis_dataset_complete.parquet`  
**Description:**  
This dataset contains individual-level observations from selected waves of the World Values Survey (WVS; Waves 1–7, approx. 1981–2022) and the International Social Survey Programme (ISSP; 1988, 1994, 2002, and 2012), merged with country-year macro indicators. Core variables covering demographics, education, gender attitudes, and macro context are harmonized across the two sources.  
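
The merge of country-year macro indicators onto respondents can be sketched as a many-to-one left join. This is an illustration under assumptions, not the project's actual pipeline: the join keys (`C_ALPHAN`, `YEAR`), the helper name `attach_macro`, and the toy values are all hypothetical.

```python
import pandas as pd


def attach_macro(individual: pd.DataFrame, macro: pd.DataFrame) -> pd.DataFrame:
    """Left-join macro indicators onto respondents (assumed keys: C_ALPHAN, YEAR)."""
    return individual.merge(
        macro,
        on=["C_ALPHAN", "YEAR"],
        how="left",               # keep every respondent, matched or not
        validate="many_to_one",   # raise if a country-year appears twice in macro
    )


# Toy data with made-up values, for illustration only.
individual = pd.DataFrame({
    "C_ALPHAN": ["DEU", "DEU", "USA", None],
    "YEAR": [2012, 2012, 2002, 2012],
    "CASEID": [1.0, 2.0, 3.0, 4.0],
})
macro = pd.DataFrame({
    "C_ALPHAN": ["DEU"],
    "YEAR": [2012],
    "libdem_index": [0.8],
})

merged = attach_macro(individual, macro)
print(merged)
```

A left join keeps respondents whose country code is missing or whose country-year has no macro record, which is one plausible reason the macro columns show more missing values than `C_ALPHAN` itself.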

**Dimensions:** 597,217 rows × 13 columns  
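
A quick way to confirm that a local copy matches this codebook is to compare its row count and column names against the values documented here. The sketch below is hedged: `check_layout` and `load_and_check` are hypothetical helper names, and `pd.read_parquet` needs a parquet engine such as `pyarrow` to be installed.

```python
import pandas as pd

# Column names as listed in this codebook (order in the file is not assumed).
EXPECTED_COLUMNS = [
    "C_ALPHAN", "YEAR", "CASEID", "Source", "SEX", "AGE", "educ_level",
    "egal_index", "libdem_index", "polyarchy_index", "gii_index",
    "GDP_per_capita_PPP", "log_gdp_ppp",
]
EXPECTED_ROWS = 597_217


def check_layout(df: pd.DataFrame, expected_rows: int = EXPECTED_ROWS) -> list:
    """Return a list of mismatches against the codebook (empty if none)."""
    problems = []
    missing = sorted(set(EXPECTED_COLUMNS) - set(df.columns))
    extra = sorted(set(df.columns) - set(EXPECTED_COLUMNS))
    if missing:
        problems.append(f"missing columns: {missing}")
    if extra:
        problems.append(f"unexpected columns: {extra}")
    if len(df) != expected_rows:
        problems.append(f"row count {len(df)} != {expected_rows}")
    return problems


def load_and_check(path: str = "data/final_analysis_dataset_complete.parquet"):
    """Read the parquet file and report any layout mismatches."""
    df = pd.read_parquet(path)
    return df, check_layout(df)
```

Calling `load_and_check()` from the repository root returns the data frame together with the list of mismatches.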

---

## Variable Descriptions

### `C_ALPHAN`  
- **Description:** ISO 3166-1 Alpha-3 country code (harmonized).  
- **Source:** WVS S003/COUNTRY_ALPHA; ISSP V3(1988)/v3(1994)/C_ALPHAN(2002)/V4(2012).  
- **Type:** `string` (nullable)  
- **Missing:** `<NA>` (~2.99%)

### `YEAR`  
- **Description:** Survey year.  
- **Source:** WVS S020; ISSP wave year.  
- **Type:** `Int64` (nullable)  
- **Missing:** None

### `CASEID`  
- **Description:** Respondent ID (unique within wave).  
- **Source:** WVS S007; ISSP V2/v2/v3/CASEID.  
- **Type:** `Float64` (nullable)  
- **Missing:** None

### `Source`  
- **Description:** Survey origin.  
- **Values:** `"WVS"`, `"ISSP"`  
- **Type:** `object`  
- **Missing:** None

### `SEX`  
- **Description:** 1 = Female, 2 = Male.  
- **Source:** WVS X001; ISSP V65/v200/SEX.  
- **Type:** `Int64` (nullable)  
- **Missing:** `<NA>` (~0.83%)

### `AGE`  
- **Description:** Age in years (cleaned).  
- **Source:** WVS X003; ISSP V66/v201/AGE.  
- **Type:** `Float64` (nullable)  
- **Missing:** `<NA>` (~0.25%)

### `educ_level`  
- **Description:** Harmonized education level in three categories: 1 = Low, 2 = Medium, 3 = High.  
- **Source:** WVS X025R; ISSP EDUCYRS/v204.  
- **Type:** `Int64` (nullable)  
- **Missing:** `<NA>` (~10.47%)

### `egal_index`  
- **Description:** Mean of 5 gender-attitude items (1–5). Higher = more egalitarian.  
- **Source:** Harmonized WVS & ISSP attitude items.  
- **Type:** `Float64` (nullable)  
- **Missing:** `<NA>` (~6.61%)
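
A row-mean index of this kind could be computed as in the sketch below. Several things here are assumptions rather than documented facts: the item column names are placeholders, the items are taken to be already rescaled to 1–5 and coded so that higher means more egalitarian, and the `min_valid` rule for respondents with missing items is hypothetical, since this codebook does not state how partial responses were handled.

```python
import numpy as np
import pandas as pd


def row_mean_index(items: pd.DataFrame, min_valid: int) -> pd.Series:
    """Mean across item columns, set to <NA> when fewer than `min_valid` are answered."""
    n_valid = items.notna().sum(axis=1)
    mean = items.mean(axis=1, skipna=True)
    # Rows below the threshold become NaN, then <NA> after the nullable cast.
    return mean.where(n_valid >= min_valid).astype("Float64")


# Toy items with placeholder names and made-up answers.
items = pd.DataFrame({
    "item_a": [1.0, 5.0, np.nan],
    "item_b": [3.0, 5.0, np.nan],
    "item_c": [5.0, np.nan, np.nan],
})
index = row_mean_index(items, min_valid=2)
print(index)
```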

### `libdem_index`  
- **Description:** V-Dem Liberal Democracy Index (0–1; higher = more democratic).  
- **Source:** V-Dem v15 (v2x_libdem).  
- **Type:** `Float64` (nullable)  
- **Missing:** `<NA>` (~8.18%)

### `polyarchy_index`  
- **Description:** V-Dem Electoral Democracy Index (0–1; higher = more democratic).  
- **Source:** V-Dem v15 (v2x_polyarchy).  
- **Type:** `Float64` (nullable)  
- **Missing:** `<NA>` (~8.18%)

### `gii_index`  
- **Description:** UNDP Gender Inequality Index (0–1; higher = more inequality).  
- **Source:** UNDP HDR23-24.  
- **Type:** `float64`  
- **Missing:** `NaN` (~19.10%)
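
Columns in this file mix plain `float64` (missing shown as `NaN`) with nullable dtypes (missing shown as `<NA>`). In pandas, `isna()` treats both the same way, so the missing percentages in this codebook can be recomputed with one call. A minimal sketch, with `missing_share` as a hypothetical helper name and toy values:

```python
import numpy as np
import pandas as pd


def missing_share(df: pd.DataFrame) -> pd.Series:
    """Percentage of missing values per column, rounded to two decimals."""
    return (df.isna().mean() * 100).round(2)


# Toy frame: one plain float64 column and one nullable Float64 column.
toy = pd.DataFrame({
    "gii_index": np.array([0.25, np.nan], dtype="float64"),
    "libdem_index": pd.array([0.5, None], dtype="Float64"),
})
shares = missing_share(toy)
print(shares)
```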

### `GDP_per_capita_PPP`  
- **Description:** GDP per capita, PPP-adjusted (2017 international $).  
- **Source:** World Bank WDI (`macro_data_WDI.csv`).  
- **Type:** `float64`  
- **Missing:** `NaN` (~16.90%)

### `log_gdp_ppp`  
- **Description:** Natural log of `GDP_per_capita_PPP`.  
- **Source:** Derived from `GDP_per_capita_PPP`.  
- **Type:** `Float64` (nullable)  
- **Missing:** `<NA>` (~16.90%)
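
The derivation of `log_gdp_ppp` can be reproduced roughly as follows. This is a sketch, not the original processing script: `add_log_gdp` is a hypothetical helper, and masking non-positive GDP values before taking the log is an assumption added for safety.

```python
import numpy as np
import pandas as pd


def add_log_gdp(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of `df` with `log_gdp_ppp` = ln(GDP_per_capita_PPP)."""
    out = df.copy()
    gdp = out["GDP_per_capita_PPP"].astype("float64")
    # Assumption: non-positive values are treated as missing rather than logged.
    logged = np.log(gdp.where(gdp > 0))
    # Cast to the nullable dtype so missing values display as <NA>.
    out["log_gdp_ppp"] = logged.astype("Float64")
    return out


# Toy values for illustration only.
toy_gdp = pd.DataFrame({"GDP_per_capita_PPP": [np.e, 1.0, np.nan]})
result = add_log_gdp(toy_gdp)
print(result)
```

Because the log is missing exactly where GDP is missing, the two columns share the same missing rate, as reported above.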
